Exploring Time-Sensitive Variational Bayesian Inference LDA for Social Media Data
Abstract
There is considerable interest among both researchers and the general public in understanding the topics of discussion on social media as they occur over time. Scholars have thoroughly analysed sampling-based topic modelling approaches for various text corpora including social media; however, another LDA topic modelling implementation, Variational Bayesian (VB), has not been well studied, despite its known efficiency and its adaptability to the volume and dynamics of social media data. In this paper, we examine the performance of the VB-based topic modelling approach for producing coherent topics, and we extend the VB approach by proposing a novel time-sensitive Variational Bayesian implementation, denoted TVB. Our TVB approach incorporates time so as to increase the quality of the generated topics. Using a Twitter dataset covering 8 events, our empirical results show that the coherence of the topics in our TVB model is improved by the integration of time. In particular, through a user study, we find that our TVB approach generates less mixed topics than state-of-the-art topic modelling approaches. Moreover, our proposed TVB approach can more accurately estimate topical trends, making it particularly suitable to assist end-users in tracking emerging topics on social media.
Similar resources
Algorithms of the LDA model [REPORT]
We review three algorithms for Latent Dirichlet Allocation (LDA). Two of them are variational inference algorithms, Variational Bayesian inference and Online Variational Bayesian inference, and one is a Markov Chain Monte Carlo (MCMC) algorithm, Collapsed Gibbs sampling. We compare their time complexity and performance. We find that online variational Bayesian inference is the fastest algorithm a...
A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation
Latent Dirichlet allocation (LDA) is a Bayesian network that has recently gained much popularity in applications ranging from document modeling to computer vision. Due to the large scale nature of these applications, current inference procedures like variational Bayes and Gibbs sampling have been found lacking. In this paper we propose the collapsed variational Bayesian inference algorithm for ...
Neural Variational Inference For Topic Models
Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address this problem is neural variational inference (NVI), but it has proven difficult to apply to topic models in practice. We present what is to our knowledge t...
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation with Nvidia CUDA Compatible Devices
In this paper, we propose an acceleration of collapsed variational Bayesian (CVB) inference for latent Dirichlet allocation (LDA) by using Nvidia CUDA compatible devices. While LDA is an efficient Bayesian multi-topic document model, it requires complicated computations for parameter estimation in comparison with other simpler document models, e.g. probabilistic latent semantic indexing, etc. T...
A Comparison of Variational Bayes and Markov Chain Monte Carlo Methods for Topic Models
Latent Dirichlet Allocation (LDA) is a Bayesian hierarchical topic model which has been widely used for discovering topics from large collections of unstructured text documents. Estimating the posterior distribution of topics as well as topic proportions for each document is the goal of inference in LDA. Since exact inference is analytically intractable for LDA, we need to use approximate inference a...